{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "# Build RAG using Zilliz Cloud Pipelines\n",
    "\n",
    "[Zilliz Cloud Pipelines](https://docs.zilliz.com/docs/pipelines) is AI-powered retrieval service. It simplifies the maintenance of information retrieval system by providing ingestion and search pipelines as easy-to-use API service. As an AI application developer, with quality optimization and devops taken care of, you can focus on building AI applications tailored to your specific use case.\n",
    "\n",
    "In this notebook, we show how to use [Zilliz Cloud Pipelines](https://zilliz.com/zilliz-cloud-pipelines) to build a simple yet scalable [Retrieval Augmented Generation (RAG)](https://zilliz.com/use-cases/llm-retrieval-augmented-generation) application. Retrieval is at the heart of RAG solution, which typically involves maintaining a knowledge base with document pieces, hosting an embedding model and using vector database as retrieval engine. With Zilliz Cloud Pipelines, you don't need to deal with such a complex tech stack. Everything can be done with an API call.\n",
    "\n",
    "We first create the an Ingestion pipeline for text indexing and a Search pipeline for knowledge retrieval. Then we run Ingestion pipeline by API call to import given text to establish the knowledge base. Finally, we build an RAG application that runs Search pipeline to conduct Retrieval Augmented Generation.\n",
    "\n",
    "![](../../images/rag_and_pipeline_text.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "## Setup\n",
    "### Prerequisites\n",
    "Please make sure you have a Serverless cluster in Zilliz Cloud. If not already, you can [sign up for free](https://cloud.zilliz.com/signup?utm_source=referral&utm_medium=partner&utm_campaign=2023-12-21_github-docs_zilliz-pipeline-rag_github).\n",
    "\n",
    "To learn how to create a Serverless cluster and get your CLOUD_REGION, CLUSTER_ID, API_KEY and PROJECT_ID, please refer to this [page](https://docs.zilliz.com/docs/create-cluster) for more details."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With the Serverless Cluster created, please get the cluster id, API key and project id as shown and fill in the following code:\n",
    "\n",
    "![](../../images/zilliz_api_key_cluster_id.jpeg)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "PROJECT_ID = 'YOUR_PROJECT_ID'\n",
    "API_KEY = 'YOUR_API_KEY'\n",
    "CLUSTER_ID = 'YOUR_CLUSTER_ID'\n",
    "CLOUD_REGION = 'gcp-us-west1'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "### Create an ingestion pipeline\n",
    "[Ingestion pipelines](https://docs.zilliz.com/docs/understanding-pipelines#ingestion-pipelines) can transform unstructured data into searchable vector embeddings and store them in Zilliz Cloud Vector Database.\n",
    "\n",
    "In the Ingestion pipeline, you can specify functions to customize its behavior. The input data that Ingestion pipeline expects also depends on the specified functions. Currently, Ingestion pipeline allows the following types of functions:\n",
    "\n",
    "- The `INDEX_TEXT` function accepts a list of text as input. It converts each text to a vector embedding and maps an input field (text_list) to two fields (text, embedding) in the corresponding collection (auto-generated if not exist).\n",
    "- The `INDEX_DOC` function expects a document as input. It splits the input text document into chunks and generates a vector embedding for each chunk. This function maps an input field (doc_url) to four output fields (doc_name, chunk_id, chunk_text, and embedding) in the corresponding collection (auto-generated if not exist).\n",
    "- The `INDEX_IMAGE` function requires an image data and its unique id as input. It generates the image embedding and maps two given input fields (image_url, image_id) to two output fields (image_id, embedding) in the corresponding collection (auto-generated if not exist).\n",
    "- The `PRESERVE` function stores a user-defined input as additional [scalar](https://milvus.io/docs/scalar_index.md) field in the corresponding collection. This is typically used to store meta information of the core inputs, such as publisher info and tags that describes the property.\n",
    "\n",
    "Please note that an ingestion function must contain one and only one index function as the core function, while the preserve function is optional.\n",
    "\n",
    "In the following example, we will create an Ingestion pipeline with an `INDEX_TEXT` function and a `PRESERVE` function. As part of creating the Ingestion pipeline, a vector database collection named `my_text_collection` will be created in the cluster. It contains five fields:\n",
    "- `id` as auto-generated for each entity\n",
    "- `text`, `embedding` as defined by `INDEX_TEXT` function\n",
    "- `title` as defined by `PRESERVE` function"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'code': 200, 'data': {'pipelineId': 'pipe-f119d6273a5ce19f65767f', 'name': 'my_ingestion_pipeline', 'type': 'INGESTION', 'description': 'A pipeline that generates text embeddings and stores title information.', 'status': 'SERVING', 'functions': [{'name': 'index_my_text', 'action': 'INDEX_TEXT', 'inputFields': ['text_list'], 'language': 'ENGLISH', 'embedding': 'zilliz/bge-base-en-v1.5'}, {'name': 'title_info', 'action': 'PRESERVE', 'inputField': 'title', 'outputField': 'title', 'fieldType': 'VarChar'}], 'clusterId': 'in03-d4426aaee81eb7e', 'collectionName': 'my_text_collection', 'totalTokenUsage': 0}}\n"
     ]
    }
   ],
   "source": [
    "import requests\n",
    "\n",
    "headers = {\n",
    "    \"Content-Type\": \"application/json\",\n",
    "    \"Accept\": \"application/json\",\n",
    "    \"Authorization\": f\"Bearer {API_KEY}\"\n",
    "}\n",
    "\n",
    "create_pipeline_url = f\"https://controller.api.{CLOUD_REGION}.zillizcloud.com/v1/pipelines\"\n",
    "\n",
    "collection_name = 'my_text_collection'\n",
    "embedding_service = \"zilliz/bge-base-en-v1.5\"\n",
    "\n",
    "data = {\n",
    "    \"name\": \"my_ingestion_pipeline\",\n",
    "    \"description\": \"A pipeline that generates text embeddings and stores title information.\",\n",
    "    \"type\": \"INGESTION\",\n",
    "    \"projectId\": PROJECT_ID,\n",
    "    \"clusterId\": CLUSTER_ID,\n",
    "    \"collectionName\": collection_name,\n",
    "    \"functions\": [\n",
    "        {\n",
    "            \"name\": \"index_my_text\",\n",
    "            \"action\": \"INDEX_TEXT\",\n",
    "            \"language\": \"ENGLISH\",\n",
    "            \"embedding\": embedding_service\n",
    "        },\n",
    "        {\n",
    "            \"name\": \"title_info\",\n",
    "            \"action\": \"PRESERVE\",\n",
    "            \"inputField\": \"title\",\n",
    "            \"outputField\": \"title\",\n",
    "            \"fieldType\": \"VarChar\"\n",
    "        }\n",
    "    ]\n",
    "}\n",
    "\n",
    "response = requests.post(create_pipeline_url, headers=headers, json=data)\n",
    "print(response.json())\n",
    "ingestion_pipe_id = response.json()[\"data\"][\"pipelineId\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "After successful creation, it will return a pipeline ID. We will run this pipeline later with pipeline ID to ingest text inputs."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "### Create a search pipeline\n",
    "[Search pipelines](https://docs.zilliz.com/docs/understanding-pipelines#search-pipelines) enables semantic search by converting a query string into a vector embedding and then retrieving top-K nearest neighbour vectors, each vector represents a chunk of ingested document and carries other associated information such as file name and preserved properties.\n",
    "\n",
    "A search pipeline contains a search function from the following types, in which you need to set the the cluster and collection to search from:\n",
    "\n",
    "- The `SEARCH_DOC_CHUNK` function expects a user query as input and returns relevant doc chunks from the knowledge base.\n",
    "- The `SEARCH_TEXT` function expects a user query as input and returns relevant text entities from the knowledge base.\n",
    "- The `SEARCH_IMAGE` function expects an image url as input, which will output data entities of most similar images.\n",
    "\n",
    "In this example, we will need a `SEARCH_TEXT` function to enable the text retrieval.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'code': 200, 'data': {'pipelineId': 'pipe-25c2ba7ab0726aa2e11e70', 'name': 'my_search_pipeline', 'type': 'SEARCH', 'description': 'A pipeline that receives text and search for semantically similar texts.', 'status': 'SERVING', 'functions': [{'name': 'search_chunk_text_and_title', 'action': 'SEARCH_TEXT', 'inputFields': ['query_text'], 'clusterId': 'in03-d4426aaee81eb7e', 'collectionName': 'my_text_collection', 'reranker': 'zilliz/bge-reranker-base', 'embedding': 'zilliz/bge-base-en-v1.5'}], 'totalTokenUsage': 0}}\n"
     ]
    }
   ],
   "source": [
    "data = {\n",
    "    \"projectId\": PROJECT_ID,\n",
    "    \"name\": \"my_search_pipeline\",\n",
    "    \"description\": \"A pipeline that receives text and search for semantically similar texts.\",\n",
    "    \"type\": \"SEARCH\",\n",
    "    \"functions\": [\n",
    "        {\n",
    "            \"name\": \"search_text_and_title\",\n",
    "            \"action\": \"SEARCH_TEXT\",\n",
    "            \"embedding\": embedding_service,\n",
    "            \"reranker\": \"zilliz/bge-reranker-base\", # optional, this will rerank search results by the reranker service\n",
    "            \"clusterId\": CLUSTER_ID,\n",
    "            \"collectionName\": collection_name,\n",
    "        }\n",
    "    ]\n",
    "}\n",
    "\n",
    "response = requests.post(create_pipeline_url, headers=headers, json=data)\n",
    "\n",
    "print(response.json())\n",
    "search_pipe_id = response.json()[\"data\"][\"pipelineId\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Similarly, after successful creation, it will return a pipeline ID. We will run this pipeline later and will use this pipeline ID."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In addition to the creating pipelines through RESTful API as introduced in this notebook, you can also create pipelines through Web UI with a few clicks. Check the [documentation](https://docs.zilliz.com/docs/pipelines-ingest-search-delete-data) to learn more about how to ingest, search, and delete different types of data (text, document, image, etc.)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "### Run ingestion pipeline\n",
    "\n",
    "The text ingestion pipeline accepts a list of text data as input. In the following demo, we run the ingestion pipeline with text pieces and subheadings from the sample blog: [What Milvus version to start with](https://milvus.io/blog/what-milvus-version-to-start-with.md)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'code': 200, 'data': {'token_usage': 225, 'num_entities': 3, 'ids': [449431798276845977, 449431798276845978, 449431798276845979]}}\n",
      "{'code': 200, 'data': {'token_usage': 135, 'num_entities': 2, 'ids': [449431798276845981, 449431798276845982]}}\n",
      "{'code': 200, 'data': {'token_usage': 136, 'num_entities': 2, 'ids': [449431798276845984, 449431798276845985]}}\n"
     ]
    }
   ],
   "source": [
    "run_pipeline_url = f\"https://controller.api.{CLOUD_REGION}.zillizcloud.com/v1/pipelines/{ingestion_pipe_id}/run\"\n",
    "\n",
    "milvus_lite_data = {\n",
    "    \"data\":\n",
    "        {\n",
    "            \"text_list\": [\n",
    "                \"As the name suggests, Milvus Lite is a lightweight version that integrates seamlessly with Google Colab and Jupyter Notebook. It is packaged as a single binary with no additional dependencies, making it easy to install and run on your machine or embed in Python applications. Additionally, Milvus Lite includes a CLI-based Milvus standalone server, providing flexibility for running it directly on your machine. Whether you embed it within your Python code or utilize it as a standalone server is entirely up to your preference and specific application requirements.\",\n",
    "                \"Milvus Lite is ideal for rapid prototyping and local development, offering support for quick setup and experimentation with small-scale datasets on your machine. However, its limitations become apparent when transitioning to production environments with larger datasets and more demanding infrastructure requirements. As such, while Milvus Lite is an excellent tool for initial exploration and testing, it may not be suitable for deploying applications in high-volume or production-ready settings.\",\n",
    "                \"Milvus Lite is perfect for prototyping on your laptop.\"\n",
    "            ],\n",
    "            \"title\": 'Milvus Lite'\n",
    "        }\n",
    "}\n",
    "\n",
    "milvus_standalone_data = {\n",
    "    \"data\":\n",
    "        {\n",
    "            \"text_list\": [\n",
    "                \"Milvus Standalone is a mode of operation for the Milvus vector database system where it operates independently as a single instance without any clustering or distributed setup. Milvus runs on a single server or machine in this mode, providing functionalities such as indexing and searching for vectors. It is suitable for situations where the data and traffic volume scale is relatively small and does not require the distributed capabilities provided by a clustered setup.\",\n",
    "                \"Milvus Standalone offers high performance and flexibility for conducting vector searches on your datasets, making it suitable for smaller-scale deployments, CI/CD, and offline deployments when you have no Kubernetes support.\"\n",
    "            ],\n",
    "            \"title\": 'Milvus Standalone'\n",
    "        }\n",
    "}\n",
    "\n",
    "milvus_cluster_data = {\n",
    "    \"data\":\n",
    "        {\n",
    "            \"text_list\": [\n",
    "                \"Milvus Cluster is a mode of operation for the Milvus vector database system where it operates and is distributed across multiple nodes or servers. In this mode, Milvus instances are clustered together to form a unified system that can handle larger volumes of data and higher traffic loads compared to a standalone setup. Milvus Cluster offers scalability, fault tolerance, and load balancing features, making it suitable for scenarios that need to handle big data and serve many concurrent queries efficiently.\",\n",
    "                \"Milvus Cluster provides unparalleled availability, scalability, and cost optimization for enterprise-grade workloads, making it the preferred choice for large-scale, highly available production environments.\"\n",
    "            ],\n",
    "            \"title\": 'Milvus Cluster'\n",
    "        }\n",
    "}\n",
    "\n",
    "for data in [milvus_lite_data, milvus_standalone_data, milvus_cluster_data]:\n",
    "    response = requests.post(run_pipeline_url, headers=headers, json=data)\n",
    "    print(response.json())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "Now we have successfully ingested the text pieces with corresponding titles and embeddings into the vector database. If you want to inspect the data in the collection, you can use the Data Preview tool in [Zilliz Cloud web UI](https://cloud.zilliz.com)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "## Build RAG application with Search pipeline\n",
    "\n",
    "### Run search pipeline\n",
    "The first step in building an RAG app is to retrieve information pieces most relevant to the question from a knowledge base (typically a vector database collection).\n",
    "\n",
    "This is as simple as running a Search pipeline that we just created above. Following is how to run a Search pipeline with query text and specifications, and we wrap this run with a function that can be used in the RAG app we will show shortly."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'code': 200,\n",
      " 'data': {'result': [{'distance': 0.8722565174102783,\n",
      "                      'id': 449431798276845977,\n",
      "                      'text': 'As the name suggests, Milvus Lite is a '\n",
      "                              'lightweight version that integrates seamlessly '\n",
      "                              'with Google Colab and Jupyter Notebook. It is '\n",
      "                              'packaged as a single binary with no additional '\n",
      "                              'dependencies, making it easy to install and run '\n",
      "                              'on your machine or embed in Python '\n",
      "                              'applications. Additionally, Milvus Lite '\n",
      "                              'includes a CLI-based Milvus standalone server, '\n",
      "                              'providing flexibility for running it directly '\n",
      "                              'on your machine. Whether you embed it within '\n",
      "                              'your Python code or utilize it as a standalone '\n",
      "                              'server is entirely up to your preference and '\n",
      "                              'specific application requirements.',\n",
      "                      'title': 'Milvus Lite'},\n",
      "                     {'distance': 0.3541138172149658,\n",
      "                      'id': 449431798276845978,\n",
      "                      'text': 'Milvus Lite is ideal for rapid prototyping and '\n",
      "                              'local development, offering support for quick '\n",
      "                              'setup and experimentation with small-scale '\n",
      "                              'datasets on your machine. However, its '\n",
      "                              'limitations become apparent when transitioning '\n",
      "                              'to production environments with larger datasets '\n",
      "                              'and more demanding infrastructure requirements. '\n",
      "                              'As such, while Milvus Lite is an excellent tool '\n",
      "                              'for initial exploration and testing, it may not '\n",
      "                              'be suitable for deploying applications in '\n",
      "                              'high-volume or production-ready settings.',\n",
      "                      'title': 'Milvus Lite'}],\n",
      "          'token_usage': 34}}\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[{'text': 'As the name suggests, Milvus Lite is a lightweight version that integrates seamlessly with Google Colab and Jupyter Notebook. It is packaged as a single binary with no additional dependencies, making it easy to install and run on your machine or embed in Python applications. Additionally, Milvus Lite includes a CLI-based Milvus standalone server, providing flexibility for running it directly on your machine. Whether you embed it within your Python code or utilize it as a standalone server is entirely up to your preference and specific application requirements.',\n",
       "  'title': 'Milvus Lite'},\n",
       " {'text': 'Milvus Lite is ideal for rapid prototyping and local development, offering support for quick setup and experimentation with small-scale datasets on your machine. However, its limitations become apparent when transitioning to production environments with larger datasets and more demanding infrastructure requirements. As such, while Milvus Lite is an excellent tool for initial exploration and testing, it may not be suitable for deploying applications in high-volume or production-ready settings.',\n",
       "  'title': 'Milvus Lite'}]"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pprint\n",
    "\n",
    "\n",
    "def retrieval_with_pipeline(question, search_pipe_id, top_k=2, verbose=False):\n",
    "    run_pipeline_url = f\"https://controller.api.{CLOUD_REGION}.zillizcloud.com/v1/pipelines/{search_pipe_id}/run\"\n",
    "\n",
    "    data = {\n",
    "        \"data\": {\n",
    "            \"query_text\": question\n",
    "        },\n",
    "        \"params\": {\n",
    "            \"limit\": top_k,\n",
    "            \"offset\": 0,\n",
    "            \"outputFields\": [\n",
    "                \"text\",\n",
    "                \"title\"\n",
    "            ],\n",
    "        }\n",
    "    }\n",
    "    response = requests.post(run_pipeline_url, headers=headers, json=data)\n",
    "    if verbose:\n",
    "        pprint.pprint(response.json())\n",
    "    results = response.json()[\"data\"][\"result\"]\n",
    "    retrieved_texts = [{'text': result['text'], 'title': result['title']} for result in results]\n",
    "    return retrieved_texts\n",
    "\n",
    "\n",
    "question = 'Which Milvus should I choose if I want to use in the jupyter notebook with a small scale of data?'\n",
    "retrieval_with_pipeline(question, search_pipe_id, top_k=2, verbose=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "We can see that when we ask a question, this search run can return the top k knowledge fragments we need. This is also a basis for forming RAG."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "### Build a chatbot powered by RAG \n",
    "With the above convenient helper function `retrieval_with_pipeline`, we can retrieve the knowledge ingested into the vector database.\n",
    "Below, we show a simple RAG app that can answer based on the knowledge we have ingested previously. It uses OpenAI `gpt-3.5-turbo` as LLM and a simple prompt. To test it, you can replace with your own OpenAI API Key."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "import os\n",
    "from openai import OpenAI\n",
    "\n",
    "\n",
    "client = OpenAI()\n",
    "client.api_key = os.getenv('OPENAI_API_KEY')  # your OpenAI API key\n",
    "\n",
    "class Chatbot:\n",
    "    def __init__(self, search_pipe_id):\n",
    "        self._search_pipe_id = search_pipe_id\n",
    "\n",
    "    def retrieve(self, query: str) -> list:\n",
    "        \"\"\"\n",
    "        Retrieve relevant text with Zilliz Cloud Pipelines.\n",
    "        \"\"\"\n",
    "        results = retrieval_with_pipeline(query, self._search_pipe_id, top_k=2)\n",
    "        return results\n",
    "\n",
    "    def generate_answer(self, query: str, context_str: list) -> str:\n",
    "        \"\"\"\n",
    "        Generate answer based on context, which is from the result of Search pipeline run.\n",
    "        \"\"\"\n",
    "        completion = client.chat.completions.create(\n",
    "            model=\"gpt-3.5-turbo\",\n",
    "            temperature=0,\n",
    "            messages=\n",
    "            [\n",
    "                {\"role\": \"user\",\n",
    "                 \"content\":\n",
    "                     f\"We have provided context information below. \\n\"\n",
    "                     f\"---------------------\\n\"\n",
    "                     f\"{context_str}\"\n",
    "                     f\"\\n---------------------\\n\"\n",
    "                     f\"Given this information, please answer the question: {query}\"\n",
    "                 }\n",
    "            ]\n",
    "        ).choices[0].message.content\n",
    "        return completion\n",
    "\n",
    "    def chat_with_rag(self, query: str) -> str:\n",
    "        context_str = self.retrieve(query)\n",
    "        completion = self.generate_answer(query, context_str)\n",
    "        return completion\n",
    "\n",
    "    def chat_without_rag(self, query: str) -> str:\n",
    "        return client.chat.completions.create(\n",
    "            model=\"gpt-3.5-turbo\",\n",
    "            temperature=0,\n",
    "            messages=\n",
    "            [\n",
    "                {\"role\": \"user\",\n",
    "                 \"content\": query\n",
    "                 }\n",
    "            ]\n",
    "        ).choices[0].message.content\n",
    "\n",
    "chatbot = Chatbot(search_pipe_id)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "This implements an RAG chatbot, it will use Search pipeline to retrieve the most relevant text pieces from the database, and enhance the answer quality with it. Let's see how it works in action!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "### Chat with RAG"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Based on the context provided, you should choose Milvus Lite if you want to use it in a Jupyter Notebook with a small scale of data. Milvus Lite is specifically designed for rapid prototyping and local development, offering support for quick setup and experimentation with small-scale datasets on your machine. It is lightweight, easy to install, and integrates seamlessly with Google Colab and Jupyter Notebook.'"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "question = 'Which Milvus should I choose if I want to use in the jupyter notebook with a small scale of data?'\n",
    "chatbot.chat_with_rag(question)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "The ground truth content in the original knowledge text is:\n",
    "> As the name suggests, **Milvus Lite is a lightweight version that integrates seamlessly with Google Colab and Jupyter Notebook.** It is packaged as a single binary with no additional dependencies, making it easy to install and run on your machine or embed in Python applications. Additionally, Milvus Lite includes a CLI-based Milvus standalone server, providing flexibility for running it directly on your machine. Whether you embed it within your Python code or utilize it as a standalone server is entirely up to your preference and specific application requirements.\n",
    "\n",
    "We can tell that the RAG we built successfully answers this question that requires deep domain knowledge."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'If you are working with a small scale of data in a Jupyter notebook, you may want to consider using Milvus CE (Community Edition). Milvus CE is a free and open-source vector database that is suitable for small-scale projects and experimentation. It is easy to set up and use in a Jupyter notebook environment, making it a good choice for beginners or those working with limited data. Additionally, Milvus CE offers a range of features and functionalities that can help you efficiently store and query your data in a vectorized format.'"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "chatbot.chat_without_rag(question)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "In opposite, the LLM without RAG doesn't have domain knowledge required for this question, even worse, it outputs incorrect answer. This is a typical example of the so called [hallucination](https://en.wikipedia.org/wiki/Hallucination_(artificial_intelligence)) problem of LLM."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "That's how to use Zilliz Cloud Pipelines to build RAG applications. To learn more, you can refer to https://docs.zilliz.com/docs/pipelines for detailed information.\n",
    "\n",
    "If you have any question, feel free to contact us at support@zilliz.com"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.18"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
